Airbender predictive text
Our modeling goal is to predict the speaker of each line of dialogue.
https://juliasilge.com/blog/last-airbender/
library(tidyverse)
── Attaching packages ──────────────────── tidyverse 1.3.0 ──
✓ ggplot2 3.3.2 ✓ purrr 0.3.4
✓ tibble 3.0.4 ✓ dplyr 1.0.2
✓ tidyr 1.1.2 ✓ stringr 1.4.0
✓ readr 1.4.0 ✓ forcats 0.5.0
── Conflicts ─────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
avatar_raw <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-08-11/avatar.csv")
── Column specification ─────────────────────────────────────
cols(
id = col_double(),
book = col_character(),
book_num = col_double(),
chapter = col_character(),
chapter_num = col_double(),
character = col_character(),
full_text = col_character(),
character_words = col_character(),
writer = col_character(),
director = col_character(),
imdb_rating = col_double()
)
avatar_raw %>%
count(character, sort = TRUE)
Rows with Scene Description are not dialogue; the main character Aang speaks the most lines overall. How does this change through the three “books” of the show?
library(tidytext)
avatar_raw %>%
filter(!is.na(character_words)) %>%
mutate(
book = fct_inorder(book),
character = fct_lump_n(character, 10)
) %>%
count(book, character) %>%
mutate(character = reorder_within(character, n, book)) %>%
ggplot(aes(n, character, fill = book)) +
geom_col(show.legend = FALSE) +
facet_wrap(~book, scales = "free") +
scale_y_reordered() +
labs(y = NULL)

Let’s create a dataset for our modeling question, and look at a few example lines.
avatar <- avatar_raw %>%
filter(!is.na(character_words)) %>%
mutate(aang = if_else(character == "Aang", "Aang", "Other")) %>%
select(aang, book, text = character_words)
avatar %>%
filter(aang == "Aang") %>%
sample_n(10) %>%
pull(text)
[1] "Sorry, ma'am."
[2] "Katara, whoever's in there might help figure out this Avatar thing!"
[3] "I'm gonna throw them, a secret dance party!"
[4] "Come on, Appa! The boat! There!"
[5] "I like jokes."
[6] "Watch this, everybody!"
[7] "Sokka, go back!"
[8] "So where am I, Roku? What is this place?"
[9] "Appa's exhausted."
[10] "Where's Appa?"
What are the highest log odds words from Aang and other speakers?
library(tidytext)
library(tidylo)
avatar_lo <- avatar %>%
unnest_tokens(word, text) %>%
count(aang, word) %>%
bind_log_odds(aang, word, n) %>%
arrange(-log_odds_weighted)
avatar_lo %>%
group_by(aang) %>%
top_n(15) %>%
ungroup() %>%
mutate(word = reorder(word, log_odds_weighted)) %>%
ggplot(aes(log_odds_weighted, word, fill = aang)) +
geom_col(alpha = 0.8, show.legend = FALSE) +
facet_wrap(~aang, scales = "free") +
labs(y = NULL)
Selecting by log_odds_weighted
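To make the idea of log odds concrete, here is a minimal base R sketch on a tiny set of made-up word counts. It computes a plain smoothed log odds ratio, which is simpler than the weighted estimate that bind_log_odds() returns; the counts and the add-one smoothing are assumptions for illustration only.

```r
# Toy word counts (made up for illustration, not taken from the show)
counts <- data.frame(
  word  = c("appa", "fire", "hello"),
  aang  = c(30, 5, 10),
  other = c(10, 45, 20)
)

# Smoothed odds of each word within one speaker group
odds <- function(n) {
  p <- (n + 1) / (sum(n) + length(n))  # add-one smoothing (an assumption)
  p / (1 - p)
}

# Positive values lean toward Aang, negative values toward other speakers
counts$log_odds_ratio <- log(odds(counts$aang)) - log(odds(counts$other))
counts[order(-counts$log_odds_ratio), ]
```

Words that one group uses far more often than the other end up at the extremes, which is the same ordering idea the plot above relies on.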

These words make sense, but the counts are probably too low to build a good model with. Instead, let’s try using text features like the number of punctuation characters, number of pronouns, and so forth.
library(textfeatures)
tf <- textfeatures(
avatar,
sentiment = FALSE, word_dims = 0,
normalize = FALSE, verbose = FALSE
)
tf %>%
bind_cols(avatar) %>%
group_by(aang) %>%
summarise(across(starts_with("n_"), mean)) %>%
pivot_longer(starts_with("n_"), names_to = "text_feature") %>%
filter(value > 0.01) %>%
mutate(text_feature = fct_reorder(text_feature, -value)) %>%
ggplot(aes(aang, value, fill = aang)) +
geom_col(position = "dodge", alpha = 0.8, show.legend = FALSE) +
facet_wrap(~text_feature, scales = "free", ncol = 6) +
# scale_fill_avatar() comes from the tvthemes package, which is not attached here
tvthemes::scale_fill_avatar("AirNomads") +
labs(x = NULL, y = "Mean text features per spoken line")
`summarise()` ungrouping output (override with `.groups` argument)
See more about these count features here: https://textfeatures.mikewk.com/reference/count_functions.html
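As a rough illustration of what count-style text features look like, here is a base R sketch applied to three of the sample lines shown earlier. The helper count_matches() is hypothetical, and the regular expressions are simplifying assumptions rather than the textfeatures implementation; only the column names are borrowed from that package’s naming style.

```r
# Three lines quoted from the sample output earlier in this post
lines <- c(
  "Sorry, ma'am.",
  "Come on, Appa! The boat! There!",
  "So where am I, Roku? What is this place?"
)

# Hypothetical helper: count matches of a regular expression in each string
count_matches <- function(x, pattern) {
  hits <- gregexpr(pattern, x, perl = TRUE)
  # gregexpr() returns -1 when nothing matches, so count positive positions
  vapply(hits, function(h) sum(h > 0), integer(1))
}

features <- data.frame(
  n_chars    = nchar(lines),
  n_words    = count_matches(lines, "[A-Za-z']+"),
  n_exclaims = count_matches(lines, "!"),
  n_commas   = count_matches(lines, ",")
)
features
```

Each spoken line becomes one row of numeric features, which is the shape of data the models below work with.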
Build a model
We can start by loading the tidymodels metapackage, and splitting our data into training and testing sets.
library(tidymodels)
set.seed(123)
avatar_split <- initial_split(avatar, strata = aang)
avatar_train <- training(avatar_split)
avatar_test <- testing(avatar_split)
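For intuition about what a stratified split does, here is a base R sketch on made-up data. It samples 75% of the rows within each class, which matches the default proportion of initial_split(); the toy class sizes are assumptions, and this is not how rsample implements it internally.

```r
set.seed(123)

# Toy outcome with an imbalance between the classes (made-up sizes)
toy <- data.frame(
  id   = 1:1000,
  aang = rep(c("Aang", "Other"), times = c(180, 820))
)

# Sample 75% of the row ids separately within each class
train_idx <- unlist(lapply(
  split(toy$id, toy$aang),
  function(ids) sample(ids, size = round(0.75 * length(ids)))
))

toy_train <- toy[toy$id %in% train_idx, ]
toy_test  <- toy[!toy$id %in% train_idx, ]

# The class proportions are preserved in both pieces
prop.table(table(toy_train$aang))
prop.table(table(toy_test$aang))
```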
Let’s create cross-validation resamples of the training data to evaluate our models.
set.seed(234)
avatar_folds <- vfold_cv(avatar_train, strata = aang)
avatar_folds
# 10-fold cross-validation using stratification
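The idea behind these resamples can be sketched in a few lines of base R. This toy version assigns rows to folds at random and ignores stratification, so it is a simplification of what vfold_cv() does; the row count is made up.

```r
set.seed(234)

n <- 20   # number of toy training rows (made up)
v <- 10   # number of folds, the vfold_cv() default

# Randomly assign each row to one of v equally sized folds
fold_id <- sample(rep(seq_len(v), length.out = n))

# For each fold, hold those rows out for assessment and fit on the rest
folds <- lapply(seq_len(v), function(k) {
  list(analysis = which(fold_id != k), assessment = which(fold_id == k))
})

lengths(folds[[1]])
```

Every row is held out exactly once, so each model is always assessed on data it was not fit to.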
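Let’s preprocess our data to get it ready for modeling.
library(textrecipes)
library(themis)
avatar_rec <- recipe(aang ~ text, data = avatar_train) %>%
step_downsample(aang) %>%
step_textfeature(text) %>%
step_zv(all_predictors()) %>%
step_normalize(all_predictors())
avatar_prep <- prep(avatar_rec)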
avatar_prep
Data Recipe
Inputs:
Training data contained 7494 data points and no missing data.
Operations:
Down-sampling based on aang [trained]
Text feature extraction for text [trained]
Zero variance filter removed 15 items [trained]
no non-missing arguments to max; returning -Inf
Centering and scaling for 12 items [trained]
juice(avatar_prep)
Let’s walk through the steps in this recipe.
First, we must tell the recipe() what our model is going to be (using a formula here) and what data we are using. Next, we downsample on our outcome, since there are many more lines spoken by characters other than Aang than by Aang. We create the text features using a step from the textrecipes package. Then we remove zero-variance variables, which in this case include the text features about URLs and hashtags. Finally, we center and scale the predictors because of the specific kind of model we want to try out.
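Two of these steps, downsampling and normalizing, can be sketched by hand in base R. The data below are made up, and the code is only meant to show the idea behind the steps, not how the recipes, themis, or textrecipes packages implement them.

```r
set.seed(1)

# Toy training data: one numeric feature and an imbalanced outcome (made up)
toy <- data.frame(
  n_words = rpois(100, lambda = 8),
  aang    = rep(c("Aang", "Other"), times = c(20, 80))
)

# Downsample: keep as many rows from each class as the smallest class has
n_min <- min(table(toy$aang))
keep <- unlist(lapply(
  split(seq_len(nrow(toy)), toy$aang),
  function(rows) sample(rows, n_min)
))
toy_down <- toy[sort(keep), ]

# Center and scale using statistics learned from the training data only
center <- mean(toy_down$n_words)
spread <- sd(toy_down$n_words)
toy_down$n_words_norm <- (toy_down$n_words - center) / spread

table(toy_down$aang)
```

After downsampling, both classes have the same number of rows, and the normalized feature has mean 0 and standard deviation 1. The stored center and spread would be reused, unchanged, on any new data.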
We’re mostly going to use this recipe in a workflow() so we don’t need to stress too much about whether to prep() or not. Since we are going to compute variable importance, we will need to come back to juice(avatar_prep).
Let’s compare two different models, a random forest model and a support vector machine model. We start by creating the model specifications.
rf_spec <- rand_forest(trees = 1000) %>%
set_engine("ranger") %>%
set_mode("classification")
rf_spec
Random Forest Model Specification (classification)
Main Arguments:
trees = 1000
Computational engine: ranger
svm_spec <- svm_rbf(cost = 0.5) %>%
set_engine("kernlab") %>%
set_mode("classification")
svm_spec
Radial Basis Function Support Vector Machine Specification (classification)
Main Arguments:
cost = 0.5
Computational engine: kernlab
Next, let’s start putting together a tidymodels workflow(), a helper object that manages modeling pipelines with pieces that fit together like Lego blocks. Notice that there is no model yet: Model: None.
avatar_wf <- workflow() %>%
add_recipe(avatar_rec)
avatar_wf
══ Workflow ═════════════════════════════════════════════════
Preprocessor: Recipe
Model: None
── Preprocessor ─────────────────────────────────────────────
4 Recipe Steps
● step_downsample()
● step_textfeature()
● step_zv()
● step_normalize()
Now we can add a model and then fit to each of the resamples. First, we can fit the random forest model.
doParallel::registerDoParallel()
set.seed(1234)
rf_rs <- avatar_wf %>%
add_model(rf_spec) %>%
fit_resamples(
resamples = avatar_folds,
metrics = metric_set(roc_auc, accuracy, sens, spec),
control = control_resamples(save_pred = TRUE)
)
Second, we can fit the support vector machine model.
set.seed(2345)
svm_rs <- avatar_wf %>%
add_model(svm_spec) %>%
fit_resamples(
resamples = avatar_folds,
metrics = metric_set(roc_auc, accuracy, sens, spec),
control = control_resamples(save_pred = TRUE)
)
Attaching package: ‘kernlab’
The following object is masked from ‘package:scales’:
alpha
The following object is masked from ‘package:purrr’:
cross
The following object is masked from ‘package:ggplot2’:
alpha
We have fit each of our candidate models to our resampled training set.
Evaluate model
collect_metrics(rf_rs)
conf_mat_resampled(rf_rs)
collect_metrics(svm_rs)
conf_mat_resampled(svm_rs)
Different, but not really better! The SVM model is better able to identify the positive cases, but at the expense of the negative cases. Overall, this is clearly a hard problem, and we have barely any predictive ability for it.
Let’s say we are more interested in detecting Aang’s lines, even at the expense of more false positives.
svm_rs %>%
collect_predictions() %>%
group_by(id) %>%
roc_curve(aang, .pred_Aang) %>%
ggplot(aes(1 - specificity, sensitivity, color = id)) +
geom_abline(lty = 2, color = "black", size = 1) +
geom_path(show.legend = FALSE, alpha = 0.6, size = 0.5) +
coord_equal()

This plot highlights how this model is barely doing better than guessing.
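The dashed diagonal corresponds to an area under the ROC curve of 0.5. As a sketch of what that number measures, the base R function below computes AUC from the rank-sum identity; auc_from_scores() is a hypothetical helper and the data are made up, so this is not the yardstick implementation.

```r
# Area under the ROC curve via the rank-sum (Mann-Whitney) identity
auc_from_scores <- function(truth, score) {
  r <- rank(score)                 # average ranks handle ties
  n_pos <- sum(truth == 1)
  n_neg <- sum(truth == 0)
  (sum(r[truth == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Tiny hand-checkable case: 3 of the 4 positive/negative pairs are ordered correctly
auc_small <- auc_from_scores(truth = c(1, 1, 0, 0), score = c(0.9, 0.4, 0.6, 0.1))
auc_small

# Scores unrelated to the truth land close to 0.5, like the dashed diagonal
set.seed(42)
auc_random <- auc_from_scores(truth = rbinom(5000, 1, 0.2), score = runif(5000))
auc_random
```

The small example gives 0.75, the share of positive/negative pairs in which the positive case gets the higher score.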
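Keeping in mind the realities of our model performance, let’s talk about how to compute variable importance for a model like an SVM, which, unlike a linear model or a tree-based model, does not carry information about variable importance within it. In this case, we can use a method like permutation of the variables.
library(vip)
set.seed(345)
avatar_imp <- avatar_wf %>%
add_model(svm_spec) %>%
fit(avatar_train) %>%
pull_workflow_fit() %>%
vi(
method = "permute", nsim = 10,
target = "aang", metric = "auc", reference_class = "Other",
pred_wrapper = kernlab::predict, train = juice(avatar_prep)
)
avatar_imp %>%
slice_max(Importance, n = 8) %>%
mutate(
Variable = str_remove(Variable, "textfeature_text_n_"),
Variable = fct_reorder(Variable, Importance)
) %>%
ggplot(aes(Importance, Variable, color = Variable)) +
geom_errorbar(aes(xmin = Importance - StDev, xmax = Importance + StDev),
alpha = 0.5, size = 0.5
) +
geom_point(size = 2) +
theme(legend.position = "none") +
labs(y = NULL)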

These are the text features that are most important globally for whether a line was spoken by Aang or not.
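To see what permutation importance is doing under the hood, here is a self-contained base R sketch. It swaps in a logistic regression and accuracy in place of the SVM and AUC used in this post, and the data are simulated, so treat it as an illustration of the idea and not of how vip computes it.

```r
set.seed(345)

# Toy data: x1 drives the outcome, x2 is pure noise (both simulated)
n <- 500
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(2 * toy$x1))

fit <- glm(y ~ x1 + x2, data = toy, family = binomial())

# Accuracy of the fitted model on a data frame
accuracy <- function(data) {
  pred <- as.integer(predict(fit, newdata = data, type = "response") > 0.5)
  mean(pred == data$y)
}

# Importance = average drop in accuracy after shuffling one column
permute_importance <- function(var, nsim = 10) {
  baseline <- accuracy(toy)
  drops <- replicate(nsim, {
    shuffled <- toy
    shuffled[[var]] <- sample(shuffled[[var]])
    baseline - accuracy(shuffled)
  })
  c(importance = mean(drops), st_dev = sd(drops))
}

imp <- sapply(c("x1", "x2"), permute_importance)
imp
```

Shuffling x1 breaks its link to the outcome and hurts performance, while shuffling the noise column x2 changes very little. Repeating the shuffle nsim times is what provides a standard deviation for error bars like the ones in the plot above.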
Finally, we can return to the testing data to confirm that our performance is about the same.
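avatar_final <- avatar_wf %>%
add_model(svm_spec) %>%
last_fit(avatar_split)
avatar_final %>%
collect_metrics()
avatar_final %>%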
collect_predictions() %>%
conf_mat(aang, .pred_class)
          Truth
Prediction Aang Other
     Aang   261  1004
     Other  188  1045
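The headline metrics can be recomputed by hand from this confusion matrix. The base R sketch below uses the four counts shown above and assumes "Aang" is treated as the positive class.

```r
# Counts copied from the confusion matrix above
# (rows are predictions, columns are the truth)
cm <- matrix(
  c(261, 188, 1004, 1045),
  nrow = 2,
  dimnames = list(
    Prediction = c("Aang", "Other"),
    Truth      = c("Aang", "Other")
  )
)

# Treating "Aang" as the positive class (an assumption)
sensitivity <- cm["Aang", "Aang"] / sum(cm[, "Aang"])
specificity <- cm["Other", "Other"] / sum(cm[, "Other"])
accuracy    <- sum(diag(cm)) / sum(cm)

round(c(sensitivity = sensitivity, specificity = specificity, accuracy = accuracy), 3)
```

This works out to a sensitivity of about 0.58, a specificity of about 0.51, and an accuracy of about 0.52, in line with a model that is barely doing better than guessing.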